14 research outputs found

    Manticore: Hardware-Accelerated RTL Simulation with Static Bulk-Synchronous Parallelism

    The demise of Moore's Law and Dennard Scaling has revived interest in specialized computer architectures and accelerators. Verification and testing of this hardware heavily uses cycle-accurate simulation of register-transfer-level (RTL) designs. The best software RTL simulators can simulate designs at 1–1000 kHz, i.e., more than three orders of magnitude slower than hardware. Faster simulation can increase productivity by speeding design iterations and permitting more exhaustive exploration. One possibility is to use parallelism as RTL exposes considerable fine-grain concurrency. However, state-of-the-art RTL simulators generally perform best when single-threaded since modern processors cannot effectively exploit fine-grain parallelism. This work presents Manticore: a parallel computer designed to accelerate RTL simulation. Manticore uses a static bulk-synchronous parallel (BSP) execution model to eliminate runtime synchronization barriers among many simple processors. Manticore relies entirely on its compiler to schedule resources and communication. Because RTL code is practically free of long divergent execution paths, static scheduling is feasible. Communication and synchronization no longer incur runtime overhead, enabling efficient fine-grain parallelism. Moreover, static scheduling dramatically simplifies the physical implementation, significantly increasing the potential parallelism on a chip. Our 225-core FPGA prototype running at 475 MHz outperforms a state-of-the-art RTL simulator on an Intel Xeon processor running at ≈3.3 GHz by up to 27.9× (geomean 5.3×) in nine Verilog benchmarks.
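
    The bulk-synchronous idea in the abstract above can be illustrated in software. The following is only a minimal sketch, not Manticore's compiler or hardware: the `simulate` function, the two-partition "design", and the register names are all hypothetical. Each partition computes its next register values from the previous cycle's state only (compute phase), and all results are committed together afterwards (exchange phase). In the sketch the loop boundary stands in for the barrier; Manticore, per the abstract, fixes this schedule at compile time so that no runtime barrier is needed.

```python
from typing import Callable, Dict, List

State = Dict[str, int]
Partition = Dict[str, Callable[[State], int]]


def simulate(partitions: List[Partition], state: State, cycles: int) -> State:
    """Run `cycles` simulated clock cycles in a BSP style (illustrative only)."""
    for _ in range(cycles):
        # Compute phase: every partition reads only the previous cycle's state,
        # so partitions are independent and could run in parallel.
        updates = [
            {reg: next_value(state) for reg, next_value in part.items()}
            for part in partitions
        ]
        # Exchange phase: commit all new register values at once.
        state = dict(state)
        for update in updates:
            state.update(update)
    return state


# Hypothetical design: a 4-bit counter, plus a register that latches the
# lowest bit of the counter value from the previous cycle.
partitions: List[Partition] = [
    {"count": lambda s: (s["count"] + 1) % 16},
    {"parity": lambda s: s["count"] & 1},
]

final = simulate(partitions, {"count": 0, "parity": 0}, cycles=5)
print(final)
```

    Because `parity` reads the counter value from before the commit, the result does not depend on the order in which partitions are evaluated, which is the property that makes a fixed schedule possible.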

    IMPACT: Interval-based Multi-pass Proteomic Alignment with Constant Traceback

    Darwin is a genomics co-processor that achieved a 15000x acceleration on long read assembly through innovative hardware and algorithm co-design. Darwin's algorithms and hardware implementation were specifically designed for DNA analysis pipelines. This paper analyzes the feasibility of applying Darwin's algorithms to the problem of protein sequence alignment. In addition to a behavioral analysis of Darwin when aligning proteins, we propose an algorithmic improvement to Darwin's alignment algorithm, GACT, in the form of a multi-pass variant that increases its accuracy on protein sequence alignment. Concretely, our proposed multi-pass variant of GACT achieves on average 14% better alignment scores.

    Fundamentals of System-on-Chip Design on Arm Cortex-M Microcontrollers

    This textbook aims to provide learners with an understanding of embedded systems built around Arm Cortex-M processor cores, a popular CPU architecture often used in modern low-power SoCs that target IoT applications. Readers will be introduced to the basic principles of an embedded system from a high-level hardware and software perspective and will then be taken through the fundamentals of microcontroller architectures and SoC-based designs. Along the way, key topics such as chip design, the features and benefits of Arm’s Cortex-M processor architectures (including TrustZone, CMSIS and AMBA), interconnects, peripherals and memory management are discussed. The material covered in this book can be considered as key background for any student intending to major in computer engineering and is suitable for use in an undergraduate course on digital design

    Melting of a phase change material in a horizontal annulus with discrete heat sources

    Phase change materials have found many industrial applications such as cooling of electronic devices and thermal energy storage. This paper numerically investigates the melting process of a phase change material in a two-dimensional horizontal annulus with different arrangements of two discrete heat sources. The sources are positioned on the inner cylinder of the annulus and modeled as constant-temperature boundary conditions. The remaining portion of the inner cylinder wall and the outer cylinder wall are treated as insulated. The emphasis is mainly on the effects of the arrangement of the heat source pair on the fluid flow and heat transfer features. The governing equations are solved on a non-uniform O-type mesh using a pressure-based finite volume method with an enthalpy-porosity technique to trace the solid-liquid interface. The results are obtained at Ra = 10^4 and presented in terms of streamlines, isotherms, melting phase front, liquid fraction and dimensionless heat flux. It is observed that, depending on the arrangement of the heat sources, the liquid fraction increases either linearly or non-linearly with time, but the increase slows toward the end of the melting process. It can also be concluded that a proper arrangement of discrete heat sources has great potential to improve the energy storage system. For instance, arrangement C3, where the heat sources are located on the bottom part of the inner cylinder wall, expedites the melting process compared to the other arrangements.
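
    For readers unfamiliar with the enthalpy-porosity technique mentioned in this abstract, the sketch below shows the two ingredients of its commonly used formulation: a liquid fraction that varies linearly between the solidus and liquidus temperatures, and a Carman-Kozeny-type momentum sink that damps velocity in solid and mushy cells. The abstract does not state the temperatures or constants used in the paper, so the function names, the mushy-zone constant, and the example temperatures are assumptions chosen purely for illustration.

```python
def liquid_fraction(temperature: float, t_solidus: float, t_liquidus: float) -> float:
    """Liquid fraction in a cell: 0 = fully solid, 1 = fully liquid."""
    if t_liquidus <= t_solidus:
        raise ValueError("t_liquidus must be greater than t_solidus")
    fraction = (temperature - t_solidus) / (t_liquidus - t_solidus)
    # Clamp to [0, 1] outside the melting range.
    return min(1.0, max(0.0, fraction))


def momentum_sink_coefficient(
    fraction: float,
    mushy_constant: float = 1.0e5,  # assumed value, not taken from the paper
    epsilon: float = 1.0e-3,        # small number that avoids division by zero
) -> float:
    """Carman-Kozeny-type damping coefficient A, used as a source term -A * u."""
    return mushy_constant * (1.0 - fraction) ** 2 / (fraction ** 3 + epsilon)


# Assumed melting range of 310 K to 320 K, for illustration only.
for temperature in (300.0, 315.0, 330.0):
    f = liquid_fraction(temperature, 310.0, 320.0)
    print(temperature, f, momentum_sink_coefficient(f))
```

    The sink vanishes in fully liquid cells and becomes very large in fully solid ones, which is how the method tracks the melting front on a fixed mesh without remeshing.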

    A Dynamically Reconfigurable Platform for High-Performance and Low-Power On-Board Processing

    FPGAs (Field Programmable Gate Arrays) are an attractive technology for high-speed data processing in space missions due to their flexibility and their better performance-to-power ratio in comparison to software. However, FPGAs suffer from three major drawbacks: (1) higher programming effort is required with respect to software; (2) hardware resources need to be allocated for each implemented function, in contrast to software functions, which can be executed on the same processing hardware; and (3) FPGAs are required to adopt radiation hardening techniques when deployed in a space environment. This paper presents a reconfigurable platform that demonstrates how modern FPGAs can be considered as computing resources like any other, suitable for emerging space applications and not subject to the above-mentioned drawbacks. In particular, we show that large FPGAs can be split into different regions containing concurrently running accelerators, which can support the execution of a single application or multiple applications. Then, in the same way as software-based multiprogrammed and multithreaded systems can dynamically create, schedule and execute threads, FPGA-based accelerators can be swapped in and out according to scheduling needs by exploiting their dynamic partial reconfiguration capability. A proof-of-concept cloud detection algorithm for Sentinel-2 multispectral images has been implemented and tested on our platform to validate the system's design principles and performance.

    Numerical study of melting in an annular enclosure filled with nano-enhanced phase change material

    Heat transfer enhancement during melting in a two-dimensional cylindrical annulus through dispersion of nanoparticles is investigated numerically. A paraffin-based nanofluid containing various volume fractions of Cu is used. The governing equations are solved on a non-uniform O-type mesh using a pressure-based finite volume method with an enthalpy-porosity technique to trace the solid-liquid interface. The effects of nanoparticle dispersion into the pure fluid, as well as the influence of significant parameters, namely nanoparticle volume fraction and natural convection, on the fluid flow and heat transfer features are studied. The results are presented in terms of streamlines, isotherms, temperature and velocity profiles, and dimensionless heat flux. It is found that the suspended nanoparticles give a higher thermal conductivity than the pure fluid and consequently enhance heat transfer. In addition, as the nanoparticle volume fraction increases, the heat transfer rate increases and the melting time decreases.

    CytoKavosh: a cytoscape plug-in for finding network motifs in large biological networks.

    Network motifs are small connected sub-graphs that have recently attracted much attention as a means of discovering the structural behavior of large and complex networks. Finding motifs of any size is one of the most important problems in complex and large networks, and it requires fast and reliable algorithms and tools. CytoKavosh is one of the best choices for finding motifs of any given size in any complex network. It relies on a fast algorithm, Kavosh, which makes it faster than other existing tools. The Kavosh algorithm applies some well-known algorithmic features and includes tricky aspects that make it efficient in this field. CytoKavosh is a Cytoscape plug-in that finds motifs of a given size in a network (directed or undirected) previously loaded into the Cytoscape workspace. Its high performance is achieved by dynamically linking the highly optimized functions of Kavosh's C++ implementation to the Cytoscape Java program, which makes the plug-in suitable for analyzing large biological networks. Significant attributes of CytoKavosh are its efficiency in time and memory usage and the absence of any implementation-related limit on motif size. CytoKavosh is implemented in the visual environment Cytoscape, which makes it convenient for users to interact with it and create visual options for analyzing the structural behavior of a network. The plug-in works on any given network, is very simple to use, and generates graphical results of the discovered motifs with any required details. No existing Cytoscape plug-in is dedicated to finding network motifs based on the original concept. We therefore introduce CytoKavosh as the first such plug-in, and we hope that it can be extended with further options to make it the best motif-analyzing tool.
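
    To make the notion of "connected sub-graphs of a given size" concrete, here is a deliberately naive sketch that counts the connected induced sub-graphs with k nodes in a small undirected graph. It is not the Kavosh algorithm and does not use any CytoKavosh or Cytoscape API: it is a brute-force enumeration over node combinations, and the function names and the toy edge list are hypothetical. A real motif finder would additionally group the sub-graphs by isomorphism class and compare their frequencies against randomized networks.

```python
from itertools import combinations
from typing import Dict, Hashable, Iterable, List, Set, Tuple

Edge = Tuple[Hashable, Hashable]


def build_adjacency(edges: Iterable[Edge]) -> Dict[Hashable, Set[Hashable]]:
    """Undirected adjacency sets built from an edge list."""
    adjacency: Dict[Hashable, Set[Hashable]] = {}
    for u, v in edges:
        adjacency.setdefault(u, set()).add(v)
        adjacency.setdefault(v, set()).add(u)
    return adjacency


def is_connected(nodes: Iterable[Hashable], adjacency: Dict[Hashable, Set[Hashable]]) -> bool:
    """Depth-first search restricted to the chosen node subset."""
    subset = set(nodes)
    start = next(iter(subset))
    seen = {start}
    stack: List[Hashable] = [start]
    while stack:
        current = stack.pop()
        for neighbour in adjacency[current] & subset:
            if neighbour not in seen:
                seen.add(neighbour)
                stack.append(neighbour)
    return seen == subset


def count_connected_subgraphs(edges: Iterable[Edge], k: int) -> int:
    """Count k-node subsets whose induced sub-graph is connected (brute force)."""
    adjacency = build_adjacency(edges)
    return sum(
        1
        for combo in combinations(sorted(adjacency), k)
        if is_connected(combo, adjacency)
    )


# Toy network: a triangle a-b-c with one extra node d attached to c.
edges = [("a", "b"), ("b", "c"), ("c", "a"), ("c", "d")]
triples = count_connected_subgraphs(edges, 3)
print(triples)
```

    The number of candidate subsets grows combinatorially with the motif size and the network size, which is why dedicated enumeration algorithms such as Kavosh are needed for large biological networks.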

    Control Panel of CytoKavosh, including the CytoKavosh control tab for getting input parameters.

    The right side of the figure shows the ‘results’ table panel after running CytoKavosh with the given input parameters. Each run of the plug-in gets its own table in a separate tab of the ‘results’ panel. These tabs keep the results until the plug-in is closed. For larger motif sizes, the number of detected motifs increases exponentially, so the ‘results’ table can be explored page by page. The bottom panel shows a graphical representation of the motif selected in the table.